Long Short-Term Memory (LSTM)

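Besides its output (the hidden vector $h_t$), an LSTM cell also keeps a "memory": the cell state $c_t$, which is carried from one time step to the next.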

Formula

Memory:

$$
\begin{aligned}
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
    &= (\text{Forget Old Info}) + (\text{Incorporate New Info}) \\
    &= \boxed{\text{Long-Term Memory}} + \boxed{\text{Short-Term Memory}}
\end{aligned}
$$

where:

$$
\begin{aligned}
x_t &= \begin{bmatrix} w_t & h_{t-1} \end{bmatrix} &[\text{synthesized input}] \\
\tilde{c}_t &= \tanh(\mathbf{W_c} \cdot x_t + b_c) &[\text{new experience}] \\
f_t &= \sigma(\mathbf{W_f} \cdot x_t + b_f) &[\text{forget control}] \\
i_t &= \sigma(\mathbf{W_i} \cdot x_t + b_i) &[\text{input control}]
\end{aligned}
$$

Output:

$$
\begin{aligned}
o_t &= \sigma(\mathbf{W_o} \cdot x_t + b_o) &[\text{output control}] \\
h_t &= o_t \odot \tanh(c_t) &[\text{hidden vector = output}]
\end{aligned}
$$
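The formulas above can be sketched as a single time step in NumPy. This is a minimal illustration rather than a reference implementation: the function names `lstm_step` and `init_params`, the `params` dictionary layout, and the small random initialization are all assumptions made for the example. It also assumes that $x_t$ is the concatenation of $w_t$ and $h_{t-1}$ into one vector, so each weight matrix has shape `(hidden_size, input_size + hidden_size)`.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, written sigma in the formulas."""
    return 1.0 / (1.0 + np.exp(-z))


def init_params(input_size, hidden_size, seed=0):
    """Hypothetical helper: small random weights and zero biases."""
    rng = np.random.default_rng(seed)
    params = {}
    for name in ("c", "f", "i", "o"):
        shape = (hidden_size, input_size + hidden_size)
        params[f"W_{name}"] = rng.normal(0.0, 0.1, size=shape)
        params[f"b_{name}"] = np.zeros(hidden_size)
    return params


def lstm_step(w_t, h_prev, c_prev, params):
    """One LSTM step: returns the new hidden vector h_t and memory c_t."""
    # Synthesized input: current input joined with the previous hidden vector.
    x_t = np.concatenate([w_t, h_prev])

    # New experience (candidate memory) and the three gates.
    c_tilde = np.tanh(params["W_c"] @ x_t + params["b_c"])
    f_t = sigmoid(params["W_f"] @ x_t + params["b_f"])  # forget control
    i_t = sigmoid(params["W_i"] @ x_t + params["b_i"])  # input control
    o_t = sigmoid(params["W_o"] @ x_t + params["b_o"])  # output control

    # Memory: keep part of the old memory, add part of the new experience.
    c_t = f_t * c_prev + i_t * c_tilde
    # Output: a gated, squashed view of the memory.
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t


# Usage: run a short sequence of 3-dimensional inputs through a 4-unit cell.
input_size, hidden_size = 3, 4
params = init_params(input_size, hidden_size)
h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
for w in np.eye(input_size):  # three one-hot inputs as a toy sequence
    h, c = lstm_step(w, h, c, params)
print(h.shape, c.shape)
```

The element-wise product $\odot$ maps to `*` on NumPy arrays, while the matrix products $\mathbf{W} \cdot x_t$ use `@`. Note that `h` and `c` are both threaded through the loop: the hidden vector feeds the next step's input, and the memory is updated only through the forget and input gates.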

Gates

Graph

Reference

by Jon